Search results for "feature selection"
showing 10 items of 139 documents
Feature selection with Ant Colony Optimization and its applications for pattern recognition in space imagery
2016
This paper presents a feature selection (FS) algorithm using Ant Colony Optimization (ACO). It is inspired by the behavior of real ants, namely their ability to find the shortest path between a food source and the nest. Two ACO-FS model applications for pattern recognition in remote sensing imagery are considered: ACO Band Selection (ACO-BS) and ACO Training Label Purification (ACO-TLP). ACO-BS reduces the dimensionality of input multispectral image data by selecting the “best” subset of bands to accomplish the classification task. ACO-TLP selects the most informative training samples from a given set of labeled vectors in order to optimize the…
Computational Methods in Developing Quantitative Structure-Activity Relationships (QSAR): A Review
2006
Virtual filtering and screening of combinatorial libraries have recently gained attention as methods complementing high-throughput screening and combinatorial chemistry. These chemoinformatic techniques rely heavily on quantitative structure-activity relationship (QSAR) analysis, a field with established methodology and a successful history. In this review, we discuss the computational methods for building QSAR models. We start by outlining their usefulness in high-throughput screening and identifying the general scheme of a QSAR model. Next, we focus on the methodologies for constructing the three main components of a QSAR model, namely the methods for describing the molecular structure …
Evaluation of the effect of chance correlations on variable selection using Partial Least Squares-Discriminant Analysis
2013
Variable subset selection is often mandatory in high-throughput metabolomics and proteomics. However, depending on the variable-to-sample ratio, variable selection is significantly susceptible to chance correlations. Evaluating the predictive capabilities of PLSDA models by cross-validation after feature selection provides overly optimistic results if the selection is performed on the entire set and no external validation set is available. In this work, a simulation of the statistical null hypothesis is proposed to test whether the discrimination capability of a PLSDA model after variable selection estimated by cross-validation is statistically higher than t…
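The optimism described in this abstract can be illustrated with a small sketch. It is not the paper's procedure: a nearest-centroid classifier stands in for PLSDA, the data are pure Gaussian noise with arbitrary labels, and all names (`top_features`, `loo_accuracy`, the sizes `n`, `p`, `k`) are assumptions made for illustration. It compares leave-one-out accuracy when features are selected once on the entire set against accuracy when selection is repeated inside each fold.

```python
import random

def top_features(X, y, k):
    """Rank features by absolute difference of class means; return the top k indices."""
    idx0 = [i for i, lab in enumerate(y) if lab == 0]
    idx1 = [i for i, lab in enumerate(y) if lab == 1]
    scores = []
    for j in range(len(X[0])):
        m0 = sum(X[i][j] for i in idx0) / len(idx0)
        m1 = sum(X[i][j] for i in idx1) / len(idx1)
        scores.append((abs(m0 - m1), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def centroid_predict(X_train, y_train, x, feats):
    """Nearest-centroid prediction restricted to the selected features."""
    dists = {}
    for lab in (0, 1):
        rows = [row for row, row_lab in zip(X_train, y_train) if row_lab == lab]
        centroid = [sum(row[j] for row in rows) / len(rows) for j in feats]
        dists[lab] = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
    return 0 if dists[0] <= dists[1] else 1

def loo_accuracy(X, y, k, select_inside):
    """Leave-one-out accuracy; selection is either done per fold or once on all data."""
    feats_all = top_features(X, y, k)  # sees every sample, including the held-out one
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = top_features(X_tr, y_tr, k) if select_inside else feats_all
        hits += int(centroid_predict(X_tr, y_tr, X[i], feats) == y[i])
    return hits / len(X)

# Null data: features carry no class information, so honest accuracy should sit near chance.
rng = random.Random(0)
n, p, k = 40, 500, 10
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [0] * (n // 2) + [1] * (n // 2)

biased = loo_accuracy(X, y, k, select_inside=False)
honest = loo_accuracy(X, y, k, select_inside=True)
print("selection on entire set:", biased)
print("selection inside folds: ", honest)
```

Because the labels are unrelated to the features, any accuracy well above 0.5 in the first figure reflects chance correlations picked up during selection rather than real discrimination.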
Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.
2013
For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivat…
Variable Selection in Predictive MIDAS Models
2014
In short-term forecasting, it is essential to take into account all available information on the current state of economic activity. Yet, the fact that various time series are sampled at different frequencies prevents an efficient use of available data. In this respect, the Mixed-Data Sampling (MIDAS) model has proved to outperform existing tools by combining data series of different frequencies. However, major issues remain regarding the choice of explanatory variables. The paper first addresses this point by developing MIDAS-based dimension reduction techniques and by introducing two novel approaches based on either a method of penalized variable selection or Bayesian stochastic searc…
Urban monitoring using multi-temporal SAR and multi-spectral data
2006
In some key operational domains, the joint use of synthetic aperture radar (SAR) and multi-spectral sensors has been shown to be a powerful tool for Earth observation. In this paper, we analyze the potential of combining interferometric SAR and multi-spectral data for urban area characterization and monitoring. This study is carried out following a standard multi-source processing chain. First, a pre-processing stage is performed taking into account the underlying physics, geometry, and statistical models for the data from each sensor. Second, two different methodologies, one for supervised and another for unsupervised approaches, are followed to obtain features that optimize the urban rela…
Analysis of compatibility between lighting devices and descriptive features using Parzen’s kernel: application to flaw inspection by artificial vision
2000
We present a supervised method, developed for industrial inspections by artificial vision, to obtain an adapted combination of descriptive features and a lighting device. This method must be implemented under real-time constraints and therefore a minimal number of features must be selected. The method is based on the assessment of the discrimination power of many descriptive features. The objective is to select the combination of descriptive features and lighting system best able to discriminate flawed classes from defect-free classes. In the first step, probability densities are computed for flawed and defect-free classes and for each tested combination. The discrimination power of the fea…
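The density estimation step mentioned in this abstract can be sketched as follows. This is a minimal one-dimensional Parzen window estimate with a Gaussian kernel; the sample values, the bandwidth `h`, and the function name `parzen_density` are hypothetical, and the paper's discrimination-power measure is not reproduced because the abstract is truncated before defining it.

```python
import math

def parzen_density(x, samples, h):
    """Parzen window estimate at x using a Gaussian kernel of bandwidth h."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

# Hypothetical values of one descriptive feature under one lighting setup.
flawed = [2.1, 2.4, 2.2, 2.8]
defect_free = [0.9, 1.1, 1.0, 1.3]
h = 0.3  # assumed bandwidth

x = 2.3
p_flawed = parzen_density(x, flawed, h)
p_ok = parzen_density(x, defect_free, h)
print(p_flawed, p_ok)

# Sanity check: the estimate should integrate to about 1 (Riemann sum over [-10, 15]).
step = 0.01
total = sum(parzen_density(-10 + i * step, flawed, h) for i in range(2501)) * step
print(total)
```

A feature and lighting combination separates the classes well when the two estimated densities overlap little, which is the intuition behind comparing densities per tested combination.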
A nondestructive intelligent approach to real‐time evaluation of chicken meat freshness based on computer vision technique
2019
In this study, the capability of a procedure combining computer vision (CV) and artificial intelligence techniques was examined for intelligent and nondestructive prediction of chicken meat freshness during the spoilage process at 4°C. The proposed system comprises the following stages: image capture, image preprocessing, image processing, computing channels, feature extraction, feature selection by a hybrid of genetic algorithm (GA) and artificial neural network (ANN), and prediction using the ANN. The number of neurons in the input layer was set to 33 (the selected features), and freshness was used as the output. The ideal ANN model was obtained with a 33‐10‐1 topology. The high performa…
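To make the 33-10-1 topology concrete, here is a minimal forward-pass sketch. The abstract gives only the layer sizes; full connectivity, bias terms, the tanh hidden activation, the linear output, and the random weights are all assumptions for illustration, not details from the study.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """One fully connected layer: an n_out x n_in weight matrix plus n_out biases."""
    return {
        "W": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }

def forward(layers, x):
    """Propagate x through the layers; tanh on hidden layers, linear on the last."""
    for depth, layer in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(layer["W"], layer["b"])]
        x = z if depth == len(layers) - 1 else [math.tanh(v) for v in z]
    return x

rng = random.Random(0)
sizes = [33, 10, 1]  # input (selected features), hidden, output (freshness)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Trainable parameters under these assumptions: (33*10 + 10) + (10*1 + 1) = 351.
n_params = sum(len(l["W"]) * len(l["W"][0]) + len(l["b"]) for l in layers)

features = [rng.random() for _ in range(33)]  # placeholder feature vector
output = forward(layers, features)
print(n_params, output)
```

Under these assumptions the network has 351 trainable parameters, which gives a sense of how compact the model is once feature selection has reduced the input to 33 values.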
Stagewise pseudo-value regression for time-varying effects on the cumulative incidence
2015
In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event s…
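The pseudo-value idea can be sketched with the standard jackknife construction: the pseudo-value for subject i is n times the full-sample estimate minus (n - 1) times the estimate with subject i left out. The example below is a simplification under stated assumptions: it uses uncensored toy data and the plain empirical proportion as the cumulative incidence estimator, whereas censored data would need an estimator that handles censoring. The function names and data are hypothetical.

```python
def cumulative_incidence(times, causes, t, cause=1):
    """Empirical share of subjects with an event of `cause` by time t (no censoring)."""
    return sum(1 for T, c in zip(times, causes) if T <= t and c == cause) / len(times)

def pseudo_values(times, causes, t, cause=1):
    """Jackknife pseudo-values: n * full estimate - (n - 1) * leave-one-out estimate."""
    n = len(times)
    full = cumulative_incidence(times, causes, t, cause)
    values = []
    for i in range(n):
        loo = cumulative_incidence(times[:i] + times[i + 1:],
                                   causes[:i] + causes[i + 1:], t, cause)
        values.append(n * full - (n - 1) * loo)
    return values

# Toy competing risks data: event time and cause (1 = event of interest, 2 = competing).
times = [1.0, 2.5, 3.0, 4.2, 5.1]
causes = [1, 2, 1, 1, 2]
t = 3.5

pv = pseudo_values(times, causes, t)
indicators = [1 if T <= t and c == 1 else 0 for T, c in zip(times, causes)]
print(pv)
print(indicators)
```

Without censoring, each pseudo-value reduces to the subject's own event indicator at time t, which is why pseudo-values can stand in for the possibly unobserved indicator as a regression response when censoring is present.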
Why is this an anomaly? Explaining anomalies using sequential explanations
2022
In most applications, anomaly detection operates in an unsupervised mode by looking for outliers in the hope that they are anomalies. Unfortunately, most anomaly detectors do not come with explanations of which features make a detected outlier point anomalous. Human analysts must therefore manually browse through each detected outlier point’s feature space to obtain the subset of features that will help them determine whether the point is genuinely anomalous or not. This paper introduces sequential explanation (SE) methods that sequentially explain to the analyst which features make the detected outlier anomalous. We present two methods for computing SEs called the outlier and…